We present an analysis of sets of matrices with rank less than or equal to a specified number
s. We provide a simple formula for the normal cone to such sets, and use this to show that
these sets are prox-regular at all points with rank exactly equal to s. The normal cone formula
appears to be new. This allows for easy application of prior results guaranteeing local linear
convergence of the fundamental alternating projection algorithm between sets, one of which is
a rank constraint set.

1 Introduction

Rank optimization is a well-developed topic that has found a tremendous number of applications
in recent years (see [23] and references therein). Most of the problems one encounters involve a
linear data model that is underdetermined and the very poorly behaved "sparsity function", either
the function determining the rank of a matrix or the function counting the number of nonzero
entries in an array. A common approach to solving sparsity optimization problems is via a convex
surrogate, most often the $\ell_1$ or (in the case of matrices) the nuclear norm. The rationale for working
with such surrogates is that the original problem is NP-complete, and thus should be avoided.
Inspired by earlier work proving local convergence of cyclic projections onto nonconvex sets with
an application to sparse signal recovery [6], and a more recent projection-reflection algorithm for
x-ray imaging [20] that appears to be very successful at working with a proximal operator of the
$\ell_0$ function and a nonlinear imaging model, we set out in the present note to determine whether
sets with sparsity constraints have some sort of regularity that might justify working directly with
sparsity rather than through convex surrogates.



Based on the work of Lewis and Sendov [15, 16], Le has obtained explicit formulas for the
generalized subdifferential of the rank function [11]. This formula shows that every point of the rank function is a
critical point, and so reasonable algorithmic strategies should not directly make use of the rank
function. Instead, we consider the lower level sets of the rank function. While sets of matrices of
rank less than a specified level are not manifolds, we show here that they are quite regular, in fact
prox-regular. While prox-regularity of these sets is not new [14], our proof of this fact, established
in Section 3, uses elementary tools, at the center of which is a particularly simple and apparently
new characterization of the normal cone to these sets, established in Proposition 3.6.



Prox-regularity of the lower level sets of the rank function immediately yields local linear
convergence of fundamental algorithms for either finding the intersection of the rank constraint set
with another set determined by some (nonlinear) data model, or for minimizing the distance to a
rank constrained set and a data set. The result, detailed in Section 4, is quite general and extends
to nonconvex data imaging models with rank constraints. Our results are an extension of results
established recently in [3] for the vector case, however at the cost of additional assumptions on the
regularity of the solution set. In particular, [3] establishes local linear convergence, with an estimate
of the radius of convergence, of alternating projections between an affine constraint and the set of vectors with
no more than s nonzero elements without any assumptions on the regularity of the intersection of
these sets, beyond the assumption that it is nonempty. Our results, in contrast, are modeled after
results of [14] and [13] where a stronger regularity of the intersection is assumed. We discuss the
difficulties in extending the tools developed in [4] to the matrix case in the conclusion. In any case,
avoiding convex surrogates is at the cost of global convergence guarantees: these results are local
and offer no panacea for solving rank optimization problems. Rather, this analysis shows that
certain macro-regularity assumptions such as restricted isometry or mutual coherence (see [23] and
references therein) play no role asymptotically in the convergence of algorithms, but rather have
bearing only on the radius of convergence. We begin this note with a review of notation and basic
results and definitions upon which we build.



2 Notation 



Throughout this paper $X$ and $Y$ are Euclidean spaces. In particular we are interested in Euclidean
spaces defined on $\mathbb{R}^{m\times n}$ where we derive the norm from the trace inner product

$$\langle y, x\rangle := \operatorname{Tr}(y^T x) \quad\text{for } x, y \in \mathbb{R}^{m\times n}, \qquad \|x\| := \sqrt{\operatorname{Tr}(x^T x)}.$$

This naturally specializes to the case when $m = n$ above and $x \in \mathbb{R}^{n\times n}$ is restricted to the
subspace of diagonal matrices. For $x \in \mathbb{R}^{m\times n}$ we denote the span of the rows of $x$ by $\operatorname{range}(x^T)$
and recall that this is orthogonal to the nullspace of the linear mapping $x : \mathbb{R}^n \to \mathbb{R}^m$,

$$\operatorname{range}(x^T) = \ker(x)^\perp.$$

For $x \in \{z \in \mathbb{R}^{n\times n} \mid z_{ij} = 0 \text{ if } i \ne j\}$ (that is, when $x$ is square diagonal) this corresponds exactly
to the usual support of vectors on $\mathbb{R}^n$:

$$\operatorname{range}(x^T) = \operatorname{supp}(\operatorname{Diag}(x)) := \{y \in \mathbb{R}^n \mid y_i = 0 \text{ for all } i \in \{1, 2, \dots, n\} \text{ with } \operatorname{Diag}(x)_i = 0\}$$

where $\operatorname{Diag}(x)$ maps the diagonal of the matrix $x \in \mathbb{R}^{m\times n}$ to a vector in $\mathbb{R}^r$ with $r = \min\{m, n\}$.
In order to emphasize this connection to the support of vectors, and reduce notational clutter we
will denote the span of the rows of $x$ by

$$\operatorname{Supp}(x) := \operatorname{range}(x^T).$$

We denote the rank of $x$ by $\operatorname{rank}(x)$ and recall that $\operatorname{rank}(x)$ is the dimension of the span
of the columns (or equivalently the rows) of $x$, which is equal to the number of nonzero
singular values. The singular values of $x \in \mathbb{R}^{m\times n}$ are the (nonnegative) square roots of the eigenvalues
of $xx^T$; these are denoted by $\sigma_j(x)$ and are assumed to be ordered so that $\sigma_i(x) \ge \sigma_j(x)$ for
$i < j$. We denote by $\sigma(x) := (\sigma_1(x), \sigma_2(x), \dots, \sigma_r(x))^T$ ($r = \min\{m, n\}$) the ordered vector of
singular values of $x$. The corresponding diagonal matrix is denoted $\Sigma(x) := \operatorname{diag}(\sigma(x)) \in \mathbb{R}^{m\times n}$
where $\operatorname{diag}(\cdot)$ maps vectors in $\mathbb{R}^r$ to matrices in $\mathbb{R}^{m\times n}$. Following [12, 15, 16] we denote the
(Lie) group of $n \times n$ orthogonal matrices by $O(n)$ and the product $O(m) \times O(n)$ by $O(m, n)$.
A singular value decomposition of $x \in \mathbb{R}^{m\times n}$ respecting the above ordering is then any pair
of orthogonal matrices $(U, V) \in O(m, n)$ together with $\Sigma(x)$ such that $x = U\Sigma(x)V^T$. We will
denote the set of pairs of orthogonal matrices that comprise singular systems for $x$ by

$$\mathcal{U}(x) := \{(U, V) \in O(m, n) \mid x = U\Sigma(x)V^T\}.$$

The closed ball centered at $x$ with radius $\rho$ is denoted by $\mathbb{B}(x, \rho)$; the unit ball centered at the
origin is simply denoted by $\mathbb{B}$. Given a set $\Omega \subset X$, we denote the distance of a point $x \in X$ to
$\Omega$ by $d_\Omega(x)$ where

$$d_\Omega(x) := \inf_{y \in \Omega} \|y - x\|.$$

If $\Omega$ is empty then we use the convention that the distance to this set is $+\infty$. The corresponding
(multivalued) projection operator of $x$ onto $\Omega$, denoted $P_\Omega(x)$, is defined by

$$P_\Omega(x) := \operatorname*{argmin}_{z \in \Omega} \|z - x\|.$$

If $\Omega$ is nonempty and closed, then the projection of any point in $X$ onto $\Omega$ is nonempty.



We define the normal cone to a closed set $\Omega \subset X$ following [24, Def. 6.3]:

Definition 2.1 (normal cone) A vector $v \in X$ is normal to a closed set $\Omega \subset X$ at $\bar{x} \in \Omega$,
written $v \in N_\Omega(\bar{x})$, if there are sequences $(x^k)_{k\in\mathbb{N}}$ in $\Omega$ with $x^k \to \bar{x}$ and $(v^k)_{k\in\mathbb{N}}$ in $X$ with $v^k \to v$
such that

$$\limsup_{x \to_\Omega x^k,\ x \ne x^k} \frac{\langle v^k, x - x^k\rangle}{\|x - x^k\|} \le 0.$$

The vectors $v^k$ are regular normals to $\Omega$ at $x^k$ and the cone of regular normals at $x^k$ is denoted
$\widehat{N}_\Omega(x^k)$.

What we are calling regular normals are called Fréchet normals in [19, Def. 1.1].

Here and elsewhere we use the notation $x \to_\Omega \bar{x}$ to mean that $x \to \bar{x}$ with $x \in \Omega$. An important
example of a regular normal is a proximal normal, defined as any vector $v \in X$ that can be written
as $v = \lambda(x - \bar{x})$ for $\lambda > 0$ and $\bar{x} \in P_\Omega(x)$ for some $x \in X$. We denote the set of proximal normals
to $\Omega$ at $\bar{x} \in \Omega$ by $N^P_\Omega(\bar{x})$. For $\Omega$ closed and nonempty, any normal $v \in N_\Omega(\bar{x})$ can be approximated
arbitrarily closely by a proximal normal [24, Exercise 6.18]. Thus we have the next result which is
key to our analysis.

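Proposition 2.2 (Theorem 1.6 of [19]) Let $\Omega \subset X$ be closed and $\bar{x} \in \Omega$. Then

$$N_\Omega(\bar{x}) = \{v \in X \mid \exists \text{ sequences } x^k \to \bar{x} \text{ and } v^k \to v \text{ with } v^k \in \operatorname{cone}(x^k - P_\Omega(x^k)) \text{ for all } k\}. \qquad (2.1)$$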



Central to our results is the regularity of the intersection of sets, which we define in terms
of a type of constraint qualification formulated with the normal cones to the sets at points in the
intersection.

Definition 2.3 (basic set intersection qualification) A family of closed sets $\Omega_1, \Omega_2, \dots, \Omega_m \subset X$
satisfies the basic set intersection qualification at a point $\bar{x} \in \cap_i \Omega_i$, if the only solution to

$$\sum_{i=1}^m y_i = 0, \qquad y_i \in N_{\Omega_i}(\bar{x}) \quad (i = 1, 2, \dots, m)$$

is $y_i = 0$ for $i = 1, 2, \dots, m$. We say that the intersection is strongly regular at $\bar{x}$ if the basic set
intersection qualification is satisfied there.



In the case $m = 2$, this condition can be written

$$N_{\Omega_1}(\bar{x}) \cap -N_{\Omega_2}(\bar{x}) = \{0\}.$$

The two set case is called the basic constraint qualification for sets in [19, Definition 3.2] and has
its origins in the generalized property of nonseparability [18] which is the n-set case. It was
later recovered as a dual characterization of what is called strong regularity of the intersection
in [10, Theorem 3]. It is called linear regularity in [13].

The case of two sets also yields the following simple quantitative characterization of strong 
regularity. 

Proposition 2.4 (Theorem 5.16 of [13]) Suppose that $\Omega_1$ and $\Omega_2$ are closed subsets of $X$. The
intersection $\Omega_1 \cap \Omega_2$ satisfies the basic set intersection qualification at $\bar{x}$ if and only if the constant

$$c := \sup\{\langle u, v\rangle \mid u \in N_{\Omega_1}(\bar{x}) \cap \mathbb{B},\ v \in -N_{\Omega_2}(\bar{x}) \cap \mathbb{B}\} < 1. \qquad (2.2)$$

Definition 2.5 (angle of regular intersections) Suppose that $\Omega_1$ and $\Omega_2$ are closed subsets of
$X$. We say that the intersection $\Omega_1 \cap \Omega_2$ is strongly regular at $\bar{x} \in \Omega_1 \cap \Omega_2$ with angle
$\theta := \cos^{-1}(c) > 0$ when the constant $c$ given by (2.2) is less than 1.
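As a rough numerical illustration of the constant in (2.2), consider the special case where both sets are linear subspaces. Each normal cone is then the orthogonal complement of the subspace, the minus sign in (2.2) is immaterial, and the constant is the largest singular value of $Q_1^T Q_2$ for orthonormal bases $Q_1$, $Q_2$ of the two normal spaces. The sketch below rests on that assumption; the function name `regularity_constant` and the two-lines example are hypothetical and not taken from the references.

```python
import numpy as np

def regularity_constant(normals_1, normals_2):
    """Constant c of (2.2) when both normal cones are subspaces.

    Each argument is a matrix whose linearly independent columns span
    the normal space of one set at the common point.
    """
    q1, _ = np.linalg.qr(normals_1)   # orthonormal basis of the first normal space
    q2, _ = np.linalg.qr(normals_2)   # orthonormal basis of the second normal space
    # sup of <u, v> over unit-ball elements equals the spectral norm of q1^T q2
    return float(np.linalg.svd(q1.T @ q2, compute_uv=False)[0])

# Two lines through the origin in R^2 meeting at angle phi (assumed example).
phi = np.pi / 3
n1 = np.array([[0.0], [1.0]])                    # normal to span{(1, 0)}
n2 = np.array([[-np.sin(phi)], [np.cos(phi)]])   # normal to span{(cos phi, sin phi)}
c = regularity_constant(n1, n2)
theta = float(np.arccos(min(c, 1.0)))            # angle in the sense of Definition 2.5
```

For these two lines the constant is strictly less than 1, so the intersection at the origin is strongly regular; it degenerates to 1 exactly when the lines coincide.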



We will also require certain regularity of the sets themselves, not just the intersection. The
following definition of prox-regularity of sets is a modern manifestation that can be traced back
to [7] and sets of positive reach. What we use here as a definition actually follows from the
equivalence of prox-regularity of sets as defined in [22, Definition 1.1] and the single-valuedness of
the projection operator on neighborhoods of the set [22, Theorem 1.3].

Definition 2.6 (prox-regularity) A nonempty closed set $\Omega \subset X$ is prox-regular at a point $\bar{x} \in \Omega$
if $P_\Omega(x)$ is single-valued around $\bar{x}$.



3 Properties of lower level sets of the rank function 

We collect here some facts that will be used repeatedly in what follows. 

Proposition 3.1 For any point $\bar{x} \in \mathbb{R}^{m\times n}$ and any sequence $(x^k)_{k\in\mathbb{N}}$ converging to $\bar{x}$ there is a
$K \in \mathbb{N}$ such that $\operatorname{rank}(\bar{x}) \le \operatorname{rank}(x^k)$ for all $k > K$.

Proof. This follows immediately from continuity of the singular values as a function of $x$. (See, for
instance, [?, Appendix D].) □

For the remainder of this note we will consider real $m \times n$ matrices and denote by $r$ the
minimum of $\{m, n\}$. The rank level set will be denoted by $S := \{y \in \mathbb{R}^{m\times n} \mid \operatorname{rank}(y) \le s\}$ for
$s \in \{0, 1, \dots, r\}$. As can be found in textbooks on matrix analysis, the projection onto this set is
just the truncation of the $r - s$ smallest singular values to zero; in the case of a tie for the s-th
largest singular value, the projection is the set of all truncations obtained by selecting $s$ of the largest
singular values, together with corresponding singular vectors, from among the tied candidates.

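Lemma 3.2 (projection onto S) For $x \in \mathbb{R}^{m\times n}$, define

$$\Sigma_s(x) := \operatorname{diag}((\sigma_1(x), \sigma_2(x), \dots, \sigma_s(x), 0, \dots, 0)^T) \in \mathbb{R}^{m\times n}.$$

The projection $P_S(x)$ is given by

$$P_S(x) = \bigcup_{(U,V) \in \mathcal{U}(x)} \{y \mid y = U\Sigma_s(x)V^T\}.$$

Proof. By [?, Theorem 7.4.51] any matrix $y \in S$ satisfies $\|x - y\| \ge \|\Sigma(x) - \Sigma_s(x)\|$. The relation
holds with equality whenever $y = U\Sigma_s(x)V^T$ for some $(U, V) \in \mathcal{U}(x)$, hence
$P_S(x) \supset \bigcup_{(U,V) \in \mathcal{U}(x)} \{y \mid y = U\Sigma_s(x)V^T\} \ne \emptyset$. On the other hand, if $\bar{y} \in P_S(x)$, then $\|x - \bar{y}\| \le \|x - y\|$
for all $y \in S$. In particular, for $y = U\Sigma_s(x)V^T$ with $(U, V) \in \mathcal{U}(x)$ we have

$$\|x - \bar{y}\| \le \|x - y\| = \|\Sigma(x) - \Sigma_s(x)\| \le \|\Sigma(x) - \Sigma(\bar{y})\| \le \|x - \bar{y}\|,$$

hence $\Sigma(\bar{y}) = \Sigma_s(x)$ and $\bar{y} \in \bigcup_{(U,V) \in \mathcal{U}(x)} \{y \mid y = U\Sigma_s(x)V^T\}$. □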
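The truncation in Lemma 3.2 can be sketched numerically as follows. This is a minimal illustration using NumPy's SVD, which returns singular values in descending order; the function name `project_rank` and the diagonal test matrix are assumptions made for the example, and in the case of ties the routine returns just one element of the projection.

```python
import numpy as np

def project_rank(x, s):
    """Return one element of P_S(x): keep the s largest singular values, zero the rest."""
    u, sigma, vt = np.linalg.svd(x, full_matrices=False)  # sigma is sorted descending
    sigma_s = sigma.copy()
    sigma_s[s:] = 0.0                                     # truncate the r - s smallest values
    return (u * sigma_s) @ vt                             # U diag(sigma_s) V^T

# Assumed example: singular values 3, 2, 1 and rank bound s = 2.
x = np.diag([3.0, 2.0, 1.0])
y = project_rank(x, 2)
dist = np.linalg.norm(x - y)   # Frobenius norm, here equal to the discarded singular value
```

The distance from `x` to `y` equals the norm of the discarded singular values, in agreement with the lower bound used in the proof.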

The next results establish that the set $S$ is prox-regular at all points where $\operatorname{rank}(x) = s$. We
make use of the following tools. For $r = \min\{m, n\}$ define the mappings
$\mathcal{J} : \mathbb{R}^{m\times n} \times (\mathbb{R}_+ \cup \{+\infty\}) \to 2^{\{1,2,\dots,r\}}$ and $\alpha_s : \mathbb{R}^{m\times n} \to [0, +\infty]$ by

$$\mathcal{J}(x, \alpha) := \{j \in \{1, 2, \dots, r\} \mid \sigma_j(x) \ge \alpha\} \quad\text{and}\quad \alpha_s(x) := \sup\{\alpha \mid |\mathcal{J}(x, \alpha)| \ge s\},$$

where $|\mathcal{J}(x, \alpha)|$ denotes the cardinality of this discrete set. We define $\mathcal{J}(x, +\infty)$ to be the empty
set. Before proceeding with our results, we collect some observations about these objects.

Lemma 3.3

(i) For all $s \in \{1, 2, \dots, r\}$ the value of the supremum in the definition of $\alpha_s(x)$ is bounded and
attained. If $s = 0$ then $\alpha_0(x) = +\infty$.

(ii) If $|\mathcal{J}(x, \alpha_s)| > s$ then $\operatorname{rank}(x) > s > 0$.

(iii) If $\operatorname{rank}(x) > s$ then $\alpha_s > 0$.

(iv) If $\operatorname{rank}(x) < s \le r$ then $\alpha_s(x) = 0$.

Proof. (i) Since the cardinality of the empty set is zero, the supremum in the definition of $\alpha_0$ is
unbounded. In any case, the cardinality of $\mathcal{J}(x, \alpha)$ is monotonically decreasing with respect to $\alpha$
for $x$ fixed from a value of $r$ at $\alpha = 0$ to $0$ for all $\alpha > \sigma_1(x)$. Thus for $x$ fixed $\alpha_s$ is bounded
for all $s \in \{1, 2, \dots, r\}$. The value of $\alpha$ for which the cardinality $s \ge 1$ is achieved is attained
precisely when $\alpha = \sigma_j(x)$ for some $j$. (ii) By definition, at $s = 0$, $\alpha_0 = +\infty$ and $|\mathcal{J}(x, +\infty)| := 0$,
so $|\mathcal{J}(x, \alpha_s)| > s$ implies that $s > 0$ and the implication $|\mathcal{J}(x, \alpha_s)| > \operatorname{rank}(x)$ follows immediately.
(iii) If $\operatorname{rank}(x) > s$ and $s = 0$, then the result is trivial since $\alpha_0 := +\infty$. If $\operatorname{rank}(x) > s$ and $s > 0$
then $s \in \{1, \dots, r - 1\}$ (it is impossible to have rank greater than $r$) and there exists an $\alpha > 0$
such that $|\mathcal{J}(x, \alpha)| \ge s + 1$. As $\alpha_{s+1}$ is the maximum of these, $\alpha_{s+1} > 0$. By the argument in (i)
$\alpha_s \ge \alpha_{s+1}$ which yields the result. (iv) In this case, only by including the zero singular values of
$x$ can the inequality $|\mathcal{J}(x, \alpha)| \ge s$ be achieved, that is by taking $\alpha = 0$. □
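The two objects in Lemma 3.3 are easy to evaluate numerically. The following sketch reads the index set as the set of $j$ with $\sigma_j(x) \ge \alpha$ and uses the fact that the supremum defining $\alpha_s(x)$ then equals $\sigma_s(x)$ for $s \ge 1$. The function names `index_set` and `alpha_s`, the comparison tolerance `tol`, and the test matrix are assumptions made for illustration.

```python
import numpy as np

def index_set(x, alpha, tol=1e-12):
    """1-based indices j with sigma_j(x) >= alpha (empty when alpha = +inf)."""
    sigma = np.linalg.svd(x, compute_uv=False)
    return [j + 1 for j, sj in enumerate(sigma) if sj >= alpha - tol]

def alpha_s(x, s):
    """sup{alpha : at least s singular values are >= alpha}; +inf for s = 0."""
    if s == 0:
        return np.inf
    return float(np.linalg.svd(x, compute_uv=False)[s - 1])

# Assumed example: singular values 2, 2, 0.
x = np.diag([2.0, 2.0, 0.0])
a1 = alpha_s(x, 1)            # sigma_1(x)
tied = index_set(x, a1)       # both indices with singular value 2
a3 = alpha_s(x, 3)            # rank(x) < 3, so the supremum is attained at 0
```

Here the index set at level `a1` has two elements although $s = 1$, which is the tie situation singled out in part (ii).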

Proposition 3.4 (properties of the projection) The following are equivalent.

(i) $P_S(x)$ is multi-valued;
(ii) $|\mathcal{J}(x, \alpha_s)| > s$.

Proof. To show that (i) implies (ii), let $y$ and $z \in P_S(x)$ with $y \ne z$. By Lemma 3.2, $y = U_y\Sigma_s(x)V_y^T$
and $z = U_z\Sigma_s(x)V_z^T$ for $(U_y, V_y)$ and $(U_z, V_z) \in \mathcal{U}(x)$. Then by [?, Theorem 7.4.51]

$$\|\Sigma(y) - \Sigma(z)\| \le \|y - z\| \le \|y - x\| + \|x - z\| = 2\|\Sigma(x) - \Sigma_s(x)\|.$$

Since $y \ne z$, the right-hand side is strictly positive,
hence $\operatorname{rank}(x) > s$. Since $y \ne z$ and they have the same singular values, the multiplicity of singular
values $\sigma_j(x)$ with value $\alpha_s$ must be greater than one, hence $|\mathcal{J}(x, \alpha_s)| > s$.

Conversely, to show that (ii) implies (i) first note that by Lemma 3.3(ii) $\operatorname{rank}(x) > s > 0$. Now
fix $y \in P_S(x)$ with $y = U_y\Sigma_s(x)V_y^T$ for $(U_y, V_y) \in \mathcal{U}(x)$. The corresponding decomposition for $x$ is
$U_y\Sigma(x)V_y^T$. Now construct the orthogonal matrix $\widetilde{V}$ by switching the $(s+1)$-th column of $V_y$ with
the $s$-th column of $V_y$. Since $|\mathcal{J}(x, \alpha_s)| > s$ we have that $x = U_y\Sigma(x)\widetilde{V}^T$. Define $z := U_y\Sigma_s(x)\widetilde{V}^T$.
By Lemma 3.2, $z \in P_S(x)$ with $\operatorname{rank}(z) = s$, but the $s$-th column of $\widetilde{V}$ is in the nullspace of $y$ so
$z \ne y$ and the projection is thus multi-valued. This completes the proof. □
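A small example makes the tie phenomenon concrete. For the $2 \times 2$ identity and $s = 1$ the two largest singular values coincide, and two different rank-one matrices attain the same minimal distance. The helper `projection_is_single_valued` is a hypothetical check written for this sketch; treating the case $\sigma_{s+1}(x) = 0$ (so that $x$ already lies in $S$) as single-valued is an assumption of the sketch rather than something stated in Proposition 3.4.

```python
import numpy as np

def projection_is_single_valued(x, s, tol=1e-12):
    """Gap test for P_S(x): True when sigma_s(x) > sigma_{s+1}(x) or x is already in S."""
    sigma = np.linalg.svd(x, compute_uv=False)
    r = sigma.size
    if s == 0 or s >= r:
        return True                   # trivial rank bounds, compare Corollary 3.5
    if sigma[s] <= tol:
        return True                   # sigma_{s+1}(x) = 0, so x itself has rank <= s
    return sigma[s - 1] - sigma[s] > tol

x = np.eye(2)                         # sigma_1 = sigma_2 = 1: tie at s = 1
y = np.diag([1.0, 0.0])               # keeps the first singular direction
z = np.diag([0.0, 1.0])               # keeps the second singular direction
dist_y = np.linalg.norm(x - y)
dist_z = np.linalg.norm(x - z)
```

Both `y` and `z` have rank one and lie at distance 1 from `x`, so the projection of the identity onto the rank-one matrices contains at least these two points.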

An immediate consequence of the above is the obvious observation that the projection onto
the trivial sparsity sets $S_0$ with $s = 0$ and $S_r$ with $s = r$ is single-valued.

Corollary 3.5 For $x \in \mathbb{R}^{m\times n}$, if $s = 0$ or $s = r$ then $P_S(x)$ is single-valued.

The normal cone of this set has the following simple characterization. 

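Proposition 3.6 (the normal cone to S) At a point $\bar{x} \in S$

$$N_S(\bar{x}) = \{v \in \mathbb{R}^{m\times n} \mid \ker(v)^\perp \cap \ker(\bar{x})^\perp = \{0\} \text{ and } \operatorname{rank}(v) \le r - s\}. \qquad (3.1)$$

Moreover, $N_S(\bar{x}) = N^P_S(\bar{x})$ at every $\bar{x}$ with $\operatorname{rank}(\bar{x}) = s$, while $N^P_S(\bar{x}) = \{0\}$ at every $\bar{x}$ with
$\operatorname{rank}(\bar{x}) < s$.

Proof. Using the definition of $\operatorname{Supp}(x) := \ker(x)^\perp$ define the sets

$$W := \{v \in \mathbb{R}^{m\times n} \mid \operatorname{Supp}(v) \cap \operatorname{Supp}(\bar{x}) = \{0\} \text{ and } \operatorname{rank}(v) \le r - s\}$$

and

$$Z(w) := \{z \in \mathbb{R}^{m\times n} \mid \operatorname{Supp}(z) \cap \operatorname{Supp}(w) = \{0\} \text{ and } \operatorname{rank}(\bar{x} + z) = s\}.$$

We first show that $W$ is nonempty and hence $Z(w)$ for $w \in W$ is nonempty. For all $\bar{x} \in \mathbb{R}^{m\times n}$
and $s \in \{0, 1, 2, \dots, r\}$ the zero matrix $0 \in W$, hence $W$ is nonempty. Next note that for $w \in W$,
$Z(w) \subset \ker(w)$ with $\dim(\ker(w)) \ge s \ge 0$, and it is always possible to find an element $z$ of $\ker(w)$
with $\operatorname{rank}(\bar{x} + z) = s$.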

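Now, choose any $w \in W$ and $z_0 \in Z(w)$ and construct the sequences $(x^k)_{k\in\mathbb{N}}$ and $(w^k)_{k\in\mathbb{N}}$ by

$$x^k := \bar{x} + \tfrac{1}{k} w + \tfrac{1}{\sqrt{k}} z_0 \quad\text{and}\quad w^k := k\,(x^k - y^k), \quad\text{for } y^k \in P_S(x^k) \ (k \in \mathbb{N}).$$

There is a $K \in \mathbb{N}$ such that for all $k > K$

$$\tfrac{1}{k} \max_j |\sigma_j(w)| < \min_j \left\{\sigma_j\left(\tfrac{1}{\sqrt{k}} z_0 + \bar{x}\right) \ \middle|\ \sigma_j\left(\tfrac{1}{\sqrt{k}} z_0 + \bar{x}\right) \ne 0\right\}.$$

Thus for all $k > K$

$$P_S(x^k) = \bar{x} + \tfrac{1}{\sqrt{k}} z_0 \quad\text{and}\quad w^k = k\left(x^k - \left(\bar{x} + \tfrac{1}{\sqrt{k}} z_0\right)\right) = w.$$

Note that by Proposition 3.4 and Lemma 3.3(ii) the representation of the projection above holds
with equality since $\operatorname{rank}\left(\bar{x} + \tfrac{1}{\sqrt{k}} z_0\right) = s$. Since $x^k \to \bar{x}$, by definition, $w \in N_S(\bar{x})$. As $w$ was
arbitrary, we have $W \subset N_S(\bar{x})$.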

We show next that, conversely, $N_S(\bar{x}) \subset W$ for $\bar{x} \in S$. The matrix $w = 0$ trivially belongs to $W$,
so we assume that $w \ne 0$. By Proposition 2.2 we can write $w$ as a limit of proximal normals, that
is, the limit of sequences $(x^k)$ and $(w^k)$ with $x^k \to \bar{x}$ and $w^k \to w$ for $w^k = t_k(x^k - y^k)$ for $y^k \in
P_S(x^k)$. We consider the corresponding singular value decompositions $y^k = U_k\Sigma_s(x^k)V_k^T$ for
$(U_k, V_k) \in \mathcal{U}(x^k)$ and $\Sigma_s(x) := \operatorname{diag}((\sigma_1(x), \sigma_2(x), \dots, \sigma_s(x), 0, \dots, 0)^T) \in \mathbb{R}^{m\times n}$ (see Lemma 3.2).
Note that $x^k$ and $y^k$ have the same left and right singular vectors with the usual ordering. The
matrices $U_k$ and $V_k$ are also collections of left and right singular vectors for $w^k$, although they do
not yield the usual ordering of singular values of $w^k$:

$$w^k = t_k U_k \widetilde{\Sigma}_s(x^k) V_k^T \quad\text{for}\quad \widetilde{\Sigma}_s(x^k) := \operatorname{diag}((0, 0, \dots, 0, \sigma_{s+1}(x^k), \sigma_{s+2}(x^k), \dots, \sigma_r(x^k))^T).$$

Let $(U, V) \in \mathcal{U}(\bar{x})$ be the limit of left and right singular vectors of $x^k$, that is, $U_k \to U$, $V_k \to
V$ where $x^k = U_k\Sigma(x^k)V_k^T \to U\Sigma(\bar{x})V^T = \bar{x}$. Then $y^k = U_k\Sigma_s(x^k)V_k^T \to U\Sigma(\bar{x})V^T$ and
$w^k = t_k U_k \widetilde{\Sigma}_s(x^k) V_k^T \to U\left(\lim_{k\to\infty} t_k \widetilde{\Sigma}_s(x^k)\right)V^T = w$. It follows immediately that $\operatorname{rank}(w) \le r - s$ and
$\operatorname{Supp}(w) \perp \operatorname{Supp}(\bar{x})$ which completes the proof of the inclusion.

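To see that each normal to the set $S$ at $\bar{x}$ with $\operatorname{rank}(\bar{x}) = s$ is actually a proximal normal,
note that if $\operatorname{rank}(\bar{x}) = s$ then by (3.1) every point $v \in N_S(\bar{x})$ can be written as
$v = \tfrac{1}{\tau}\left((\tau v + \bar{x}) - P_S(\tau v + \bar{x})\right)$ for $\tau > 0$ small enough. Suppose, on the other hand, that $\operatorname{rank}(\bar{x}) < s$.
Then $P_S(\tau v + \bar{x}) = \bar{x}$ for $\tau > 0$ exactly when $v = 0$: for if $\tau v + \bar{x} \in S$ then $P_S(\tau v + \bar{x}) = \tau v + \bar{x} = \bar{x}$
exactly when $v = 0$, and if $\tau v + \bar{x} \notin S$ then $\operatorname{rank}(P_S(\tau v + \bar{x})) = s$ hence $P_S(\tau v + \bar{x}) \ne \bar{x}$.
Consequently the only proximal normal at these points is $v = 0$. This completes the proof. □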
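The structure of proximal normals used in the proof can be observed numerically. In the sketch below a generic matrix `x` is projected onto the rank-$s$ set, and the residual `w = x - y` is a proximal normal direction at `y`. The check confirms that the rows and the columns of `w` are orthogonal to those of `y`, that `w` has rank at most $r - s$, and that `y` is recovered as the projection of `y + tau * w` for a small step. The random test matrix, the seed, the step `tau`, and the rank tolerance are assumptions of the example.

```python
import numpy as np

def project_rank(x, s):
    """Return one element of P_S(x) by truncating the SVD to the s largest singular values."""
    u, sigma, vt = np.linalg.svd(x, full_matrices=False)
    sigma = sigma.copy()
    sigma[s:] = 0.0
    return (u * sigma) @ vt

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))          # generic matrix, r = min(4, 3) = 3
s = 2
y = project_rank(x, s)                   # point of S, generically of rank exactly s
w = x - y                                # proximal normal direction to S at y

row_overlap = np.linalg.norm(y @ w.T)    # inner products between rows of y and rows of w
col_overlap = np.linalg.norm(y.T @ w)    # inner products between columns of y and columns of w
rank_w = np.linalg.matrix_rank(w, tol=1e-10)

tau = 0.5
y_again = project_rank(y + tau * w, s)   # y is the projection of y + tau * w
```

The last line mirrors the representation of a normal as a scaled difference between a point and its projection.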

The normal cone condition $N_S(\bar{x}) \cap (-N_\Omega(\bar{x})) = \{0\}$ can easily be checked by determining the
nullspace of matrices in $N_\Omega(\bar{x})$ as the next proposition shows.

Proposition 3.7 (strong regularity of intersections with a sparsity set) Let $\Omega \subset \mathbb{R}^{m\times n}$
be closed. If at a point $\bar{x} \in \Omega \cap S$ all nonzero $v \in N_\Omega(\bar{x})$ have $\ker(v)^\perp \cap \ker(\bar{x})^\perp \ne \{0\}$, then
the intersection is strongly regular there.

Proof. Choose any nonzero $v \in N_\Omega(\bar{x})$. Since $\operatorname{Supp}(v) \cap \operatorname{Supp}(\bar{x}) \ne \{0\}$ and $N_S(\bar{x})$ given by (3.1) is a
subset of matrices $w$ with $\operatorname{Supp}(w) \cap \operatorname{Supp}(\bar{x}) = \{0\}$, the only solution to $v - w = 0$ is $v = w = 0$.
□

It is known that the set of matrices with rank s is a smooth manifold [1] (although the set
of matrices with rank less than or equal to s is not), from which it follows that S is prox-regular
[14, Lemma 2.1 and Example 2.3]. We present here a simple proof of this fact based on the
characterization of the normal cone.

Proposition 3.8 (prox-regularity of S) The set $S$ is prox-regular at all points $\bar{x} \in \mathbb{R}^{m\times n}$ with
$\operatorname{rank}(\bar{x}) = s$.
Proof. Let $(x^k)_{k\in\mathbb{N}}$ in $\mathbb{R}^{m\times n}$ be any sequence converging to $\bar{x}$ with the corresponding singular
value decomposition $U_k\Sigma(x^k)V_k^T$. Decompose $x^k$ into the sum $y^k + z^k = x^k$ where
$y^k = U_k\Sigma_s(x^k)V_k^T$ and $z^k = U_k\widetilde{\Sigma}_s(x^k)V_k^T$ with $\Sigma_s(x^k) := \operatorname{diag}((\sigma_1(x^k), \sigma_2(x^k), \dots, \sigma_s(x^k), 0, \dots, 0)^T)$ and
$\widetilde{\Sigma}_s(x^k) := \operatorname{diag}((0, \dots, 0, \sigma_{s+1}(x^k), \sigma_{s+2}(x^k), \dots, \sigma_r(x^k))^T)$ for $r = \min\{m, n\}$. Note that $y^k \to \bar{x}$ with
$\operatorname{rank}(y^k) = \operatorname{rank}(\bar{x}) = s$ for all $k$ large enough, while by Proposition 3.6 $z^k \to 0$ with $z^k \in N_S(y^k)$
for all $k$. Then for all $k$ large enough $\max_j\{\sigma_j(z^k)\} = \sigma_{s+1}(x^k) < \sigma_s(x^k) = \min_j\{\sigma_j(y^k)\}$ and
$|\mathcal{J}(x^k, \alpha_s)| = s$. By Proposition 3.4 the projection $P_S(x^k)$ is single-valued. Since the sequence was
arbitrarily chosen, it follows that the projection is single-valued on a neighborhood of $\bar{x}$, hence $S$
is prox-regular. □



4 Algorithms for optimization with a rank constraint 



The prox-regularity of the set S has a number of important implications regarding numerical
algorithms. Principal among these is local linear convergence of the elementary alternating projection
and steepest descent algorithms. There has been a tremendous number of articles published in
recent years about convex (and nonconvex) relaxations of the rank function, and when the solution
of optimization problems with respect to these relaxations corresponds to the optimization problem
with the rank function (see the review article [23] and references therein). The motivation for such
relaxations is that there are polynomial-time algorithms for the solution of the relaxed problems,
while the rank minimization problem is NP-complete. As we will show in this section, the above
theory implies that in the neighborhood of a solution there are polynomial-time algorithms for the
solution of optimization problems with rank constraints. This observation was anticipated in [2]
and notably [5] where a (globally) linearly convergent projected gradient algorithm with a rank
constraint was presented. Without further assumptions, however, such assurances of convergence
of algorithms for problems with rank constraints come at the cost of global guarantees of convergence.



4.1 Inexact, extrapolated alternating projections 

To the extent that the singular value decomposition can be computed exactly, the projection of a
point $x$ onto the rank lower level set $S$ can be calculated exactly simply by ordering the singular
values of $x$ and truncating. The above analysis immediately yields local linear convergence of
exact and inexact alternating projections for finding the intersection $S \cap M$ for $M$ closed on
neighborhoods of points where the intersection is strongly regular. The following algorithm allows
for inexact evaluation of the fixed point operator, and hence implementable algorithms.

Algorithm 4.1 (inexact alternating projections [17]) Fix $\gamma > 0$ and choose $x^0 \in S$ and
$x^1 \in M$. For $k = 1, 2, 3, \dots$ generate the sequence $(x^{2k})$ in $S$ with $x^{2k} \in P_S(x^{2k-1})$ where
the sequence $(x^{2k+1})$ in $M$ satisfies

$$\|x^{2k+1} - x^{2k}\| \le \|x^{2k} - x^{2k-1}\|,$$
for

$$x_*^{2k+1} = P_{\dots}\left(x^{2k} \dots\right)$$

and

$$\dots \quad 0 \quad \text{if } x_*^{2k+1} = x^{2k}.$$



For $\gamma = 0$ and $x^{2k+1} = x_*^{2k+1}$ the inexact algorithm reduces to the usual alternating projections
algorithm. Note that the odd iterates $x^{2k+1}$ can lie on the interior of $M$. This is the major
difference between Algorithm 4.1 and the one specified in [13] where all of the iterates are assumed
to lie on the boundary of $M$. We include this feature to allow for extrapolated iterates in the case
where $M$ has interior.
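The exact case ($\gamma = 0$) of the iteration can be sketched as follows. This is a minimal illustration and not the implementation of [17]: the set $M$ is assumed to be the affine set of matrices with prescribed entries (hypothetical helper `project_entries`), the rank projection is a truncated SVD, and the $2 \times 2$ data are made up so that the only rank-one matrix in $M$ is the matrix of all ones.

```python
import numpy as np

def project_rank(x, s):
    """Return one element of P_S(x) by truncating the SVD to the s largest singular values."""
    u, sigma, vt = np.linalg.svd(x, full_matrices=False)
    sigma = sigma.copy()
    sigma[s:] = 0.0
    return (u * sigma) @ vt

def project_entries(x, mask, values):
    """Projection onto the affine set M = {x : x[mask] = values[mask]}."""
    out = x.copy()
    out[mask] = values[mask]
    return out

def alternating_projections(x0, s, mask, values, iters=200):
    """Exact alternating projections between S (rank <= s) and M (fixed entries)."""
    x = project_entries(x0, mask, values)
    y = x
    for _ in range(iters):
        y = project_rank(x, s)                 # even iterate: a point of P_S
        x = project_entries(y, mask, values)   # odd iterate: the point P_M of the even iterate
    return x, y

# Assumed data: three entries fixed to 1, the (1, 1) entry free, rank bound s = 1.
mask = np.array([[True, True], [True, False]])
values = np.ones((2, 2))
x0 = np.array([[1.0, 1.0], [1.0, 1.5]])
x, y = alternating_projections(x0, 1, mask, values)
gap = np.linalg.norm(x - y)                    # distance between the last two iterates
```

In this example the free entry contracts toward 1 at every sweep, which is the kind of local linear behavior the theory describes; for other data and starting points no such behavior is guaranteed by this sketch.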

Theorem 4.2 (inexact alternating projections with a rank lower level set) Let
$M, S \subset \mathbb{R}^{m\times n}$ be closed with $S := \{y \mid \operatorname{rank}(y) \le s\}$ and suppose there is an $\bar{x} \in M \cap S$ with
$\operatorname{rank}(\bar{x}) = s$. Suppose furthermore that $M$ and $S$ have strongly regular intersection at $\bar{x}$ with angle
$\theta$. Define $\bar{c} := \cos(\theta) < 1$ and fix the constants $c \in (\bar{c}, 1)$ and $\gamma < \sqrt{1 - c^2}$. For $x^0$ and $x^1$ close
enough to $\bar{x}$, the iterates in Algorithm 4.1 converge to a point in $M \cap S$ with R-linear rate

$$\sqrt{c\sqrt{1 - \gamma^2} + \gamma\sqrt{1 - c^2}} < 1.$$

If, in addition, $M$ is prox-regular at $\bar{x}$, then the iterates converge with rate

$$c\sqrt{1 - \gamma^2} + \gamma\sqrt{1 - c^2} < 1.$$



Proof. Since by Proposition 3.8 $S$ is prox-regular at $\bar{x}$ the results follow immediately from [17,
Theorem 4.4]. □
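To get a feeling for the rates in Theorem 4.2, the following sketch evaluates them for given constants. It assumes the two rate expressions read $\sqrt{c\sqrt{1-\gamma^2} + \gamma\sqrt{1-c^2}}$ and $c\sqrt{1-\gamma^2} + \gamma\sqrt{1-c^2}$, with $\cos(\theta) < c < 1$ and $\gamma < \sqrt{1 - c^2}$. The function name `ap_rates` and the sample values are hypothetical.

```python
import math

def ap_rates(theta, c, gamma):
    """Evaluate the two rate expressions for angle theta, constant c and inexactness gamma.

    Returns (general_rate, prox_regular_rate); raises ValueError when the
    constants violate cos(theta) < c < 1 or 0 <= gamma < sqrt(1 - c^2).
    """
    c_bar = math.cos(theta)
    if not (c_bar < c < 1.0):
        raise ValueError("need cos(theta) < c < 1")
    if not (0.0 <= gamma < math.sqrt(1.0 - c * c)):
        raise ValueError("need 0 <= gamma < sqrt(1 - c^2)")
    inner = c * math.sqrt(1.0 - gamma ** 2) + gamma * math.sqrt(1.0 - c ** 2)
    return math.sqrt(inner), inner

# Assumed sample values: angle pi/3, first with exact and then with inexact projections.
exact_general, exact_prox = ap_rates(math.pi / 3, 0.6, 0.0)
inexact_general, inexact_prox = ap_rates(math.pi / 3, 0.6, 0.5)
```

With $\gamma = 0$ the second expression reduces to $c$, and increasing $\gamma$ toward its upper bound pushes both expressions toward 1.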

Remark 4.3 The above result requires only closedness of the set $M$. For example, this yields
convergence for affine sets $M = \{x \mid Ax = b\}$ which are not only closed, but convex. But the above
result is not restricted to such nice sets. Another important example is inverse scattering with
sparsity constraints [20]. Here the set $M$ is $M = \{x \in \mathbb{C}^n \mid |(Fx)_j|^2 = b_j,\ j = 1, 2, \dots, n\}$ where
$F$ is a linear mapping (the discrete Fourier or Fresnel transform) and $b$ is some measurement (a far
field intensity measurement). This set is not convex, but it is certainly closed (in fact prox-regular),
so again, we can apply the above results to provide local guarantees of convergence for nonconvex
alternating projections with a sparsity set.



Paradoxically, in the vector case it is the projection onto the affine constraint that in general cannot 
be evaluated exactly, while the projection onto the sparsity set S can be implemented exactly 
by simply (hard) thresholding the vectors. In the matrix case, this is no longer possible since in 
general the singular values cannot be evaluated exactly. In order to accommodate both projections 
being approximately evaluated, we explore one possible solution using a common reformulation of 
the problem on a product space. This is explained next. 
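For the vector case mentioned above, the exact projection onto the set of vectors with at most $s$ nonzero entries is plain hard thresholding. The sketch below uses only the standard library; the function name `hard_threshold` and the rule for breaking ties (the entry with the lower index is kept) are assumptions of the example, and with ties the routine returns only one element of the projection.

```python
def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and set the others to zero."""
    # sorted() is stable, so among equal magnitudes the lower index is kept
    order = sorted(range(len(v)), key=lambda i: -abs(v[i]))
    keep = set(order[:s])
    return [vi if i in keep else 0.0 for i, vi in enumerate(v)]

# Assumed example: keep the two largest entries in magnitude.
v = [0.5, -3.0, 2.0, 0.1]
p = hard_threshold(v, 2)
```

No factorization is involved, which is why this projection can be evaluated exactly, in contrast to the truncated singular value decomposition needed in the matrix case.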

4.2 Approximate steepest descent

Another fundamental approach to solving such problems is simply to minimize the sum of the
(squared) distances to the sets $M$ and $S$:

$$\text{minimize} \quad \tfrac{1}{4}\left(d^2(x, S) + d^2(x, M)\right).$$

Steepest descent without line search is: given $x^0 \in \mathbb{R}^{m\times n}$ generate the sequence $(x^k)_{k\in\mathbb{N}}$ in $\mathbb{R}^{m\times n}$
via

$$x^{k+1} = x^k - \nabla \tfrac{1}{4}\left(d^2(x^k, S) + d^2(x^k, M)\right).$$

If $S$ and $M$ were convex and the distance function the Euclidean distance, it is well-known that
this would be equivalent to averaged projections:

$$x^{k+1} = x^k - \nabla \tfrac{1}{4}\left(d^2(x^k, S) + d^2(x^k, M)\right) = \tfrac{1}{2}\left(P_S(x^k) + P_M(x^k)\right). \qquad (4.2)$$



If we assume that $M$ is prox-regular, then, since we have already established the prox-regularity of
$S$, the correspondence between the derivative of the sum of squared distances to these sets and the
projection operators in (4.2) holds on (common) open neighborhoods of $M$ and $S$ [22, Theorem
1.3]. Using a common product space formulation due to [21] we can show that (4.2) is equivalent
to alternating projections between the sets

$$D := \{(x, y) \in \mathbb{R}^{m\times n} \times \mathbb{R}^{m\times n} \mid x = y\}$$

and

$$\Omega := \{(x, y) \in \mathbb{R}^{m\times n} \times \mathbb{R}^{m\times n} \mid x \in S,\ y \in M\},$$

that is,

$$(x^{k+1}, x^{k+1}) = P_D(P_\Omega((x^k, x^k)))$$

where $x^{k+1}$ is given by (4.2). The set $\Omega$ is prox-regular if $M$ and $S$ are, and the set $D$ is convex,
so Theorem 4.2 guarantees local linear convergence of the sequence of iterates $(x^k)_{k\in\mathbb{N}}$ with rate
depending on the angle of strong intersection of the sets $D$ and $\Omega$. We cannot expect to be able
to compute the projection onto the set $\Omega$ exactly, but we can reasonably assume to be able to
compute the projection onto the diagonal $D$ exactly, even if the magnitudes of the elements of
$P_S(x^k)$ and $P_M(x^k)$ differ in orders of magnitude beyond our numerical precision. Indeed, since the
projection operators $P_S$ and $P_M$ are Lipschitz continuous for $S$ and $M$ prox-regular [22, Theorem
1.3], we can attribute any error we in fact make in the evaluation of $P_D$ to the evaluation of $P_\Omega$
where we compute an approximation according to Algorithm 4.1. Again, Theorem 4.2 guarantees
local linear convergence with rate governed by the angle of strong regularity between $D$ and $\Omega$ and
the accuracy of the approximate projection onto $\Omega$.
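The averaged projections iteration, which is the unit-step steepest descent on one quarter of the sum of squared distances, can be sketched in the same toy setting of matrices with prescribed entries. As before this is an assumption-laden illustration: `project_entries` is a hypothetical stand-in for the projection onto $M$, the data are made up, and the iteration count is chosen generously rather than derived from the theory.

```python
import numpy as np

def project_rank(x, s):
    """Return one element of P_S(x) by truncating the SVD to the s largest singular values."""
    u, sigma, vt = np.linalg.svd(x, full_matrices=False)
    sigma = sigma.copy()
    sigma[s:] = 0.0
    return (u * sigma) @ vt

def project_entries(x, mask, values):
    """Projection onto the affine set M = {x : x[mask] = values[mask]}."""
    out = x.copy()
    out[mask] = values[mask]
    return out

def averaged_projections(x0, s, mask, values, iters=600):
    """Iterate x <- (P_S(x) + P_M(x)) / 2, i.e. P_D after P_Omega on the product space."""
    x = x0.copy()
    for _ in range(iters):
        x = 0.5 * (project_rank(x, s) + project_entries(x, mask, values))
    return x

# Assumed data: three entries fixed to 1, the (1, 1) entry free, rank bound s = 1.
mask = np.array([[True, True], [True, False]])
values = np.ones((2, 2))
x0 = np.array([[1.0, 1.0], [1.0, 1.5]])
x = averaged_projections(x0, 1, mask, values)
err = np.linalg.norm(x - np.ones((2, 2)))   # distance to the unique rank-one point of M
```

Unlike the alternating scheme, the iterates here belong to neither set until the limit; both projections are simply averaged, which corresponds to projecting the pair of projections onto the diagonal.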



5 Conclusion 



We have developed a novel characterization of the normal cone to the lower level sets of the rank
function. This enables us to obtain a simple proof of the prox-regularity of such sets. This property
then allows for a straightforward application of previous results on the local linear convergence
of approximate alternating projections for finding the intersection of rank constrained sets and
another closed set, as long as the intersection is strongly regular at a reference point $\bar{x}$. Our
characterization of the normal cone to rank constraint sets allows for easy characterization and
verification of the strong regularity of intersections of these sets with other sets. The results are
also extended to the elementary steepest descent algorithm for minimizing the sum of squared
distances to sets, one of which is a rank constraint set. This implies that, in the neighborhood of
a solution with sufficient regularity, there are polynomial time algorithms for directly solving rank
constraint problems without resorting to convex relaxations or heuristics.

What remains to be determined is the radius of convergence of these algorithms. Using the
restricted normal cone developed in [4], Bauschke and coauthors [3] obtained linear rates with
estimates of the radius of convergence of alternating projections applied to affine sparsity constrained
problems (that is, the vector affine case of the setting considered here), assuming only existence
of solutions. The restricted normal cone is not immediately applicable here since the restrictions
in [3] are over countable collections of subspaces representing all possible s-sparse vectors. For
the rank function this is problematic since the collection of all possible s-rank matrices is not
countable. Extending the tools of [4] to the matrix case is the focus of future research.

The results one might obtain using the tools of [4] or similar, however, are based on the
regularity near the solution, what we call micro-regularity. We cannot expect the estimates for the
radius of convergence to extend very far using these tools, unless certain local-to-global properties
like convexity are assumed. In [5] a scalable restricted isometry property is used to prove global
convergence of a projected gradient algorithm to the unique solution to the problem of minimizing
the distance to an affine subspace subject to a rank constraint. The (scalable) restricted
isometry property and other properties like it (mutual coherence, etc.) directly concern uniqueness of
solutions and indirectly provide sufficient conditions for global convergence of algorithms for
solving relaxations of the original sparsity/rank optimization problem. A natural question is whether
there is a more general macro-regularity property than the scalable restricted isometry property,
one independent of considerations of uniqueness of solutions, that guarantees global convergence.